Simulation-based optimization and sensibility analysis of MPI applications: Variability matters
Abstract
Finely tuning MPI applications and understanding the influence of key parameters (number of processes, granularity, collective operation algorithms, virtual topology, and process placement) is critical to obtain good performance on supercomputers. With the high consumption of running applications at scale, doing so solely to optimize their performance is particularly costly. Having inexpensive but faithful predictions of expected performance could be a great help for researchers and system administrators. The methodology we propose decouples the complexity of the platform, which is captured through statistical models of the performance of its main components (MPI communications, BLAS operations), from the complexity of adaptive applications by emulating the application and skipping regular non-MPI parts of the code. We demonstrate the capability of our method with High-Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. We briefly present (1) how the open-source version of HPL can be slightly modified to allow a fast emulation on a single commodity server at the scale of a supercomputer. Then we present (2) an extensive (in)validation study that compares simulation with real experiments and demonstrates our ability to predict the performance of HPL within a few percent consistently. This study allows us to identify the main modeling pitfalls (e.g., spatial and temporal node variability or network heterogeneity and irregular behavior) that need to be considered. Last, we show (3) how our “surrogate” allows studying several subtle HPL parameter optimization problems while accounting for the uncertainty on the platform.
Similar resources
Self-optimizing MPI Applications: A Simulation-Based Approach
Historically, high performance systems use schedulers and intelligent resource managers in order to optimize system usage and application performance. Most of the times, applications just issue requests of resources to the central system. This centralized approach is an unnecessary constraint for a class of potentially flexible applications, whose resource usage may be modulated as a function o...
Sensibility analysis of BGP convergence and scalability using network simulation
The Border Gateway Protocol (BGP) is the quasi-standard for the routing between autonomous systems in the Internet. Instabilities in the topology like a failing link can lead to a considerable delay in convergence times. Therefore it is necessary to gain a better understanding of the global dynamics and underlying mechanisms of BGP. In this work we perform a sensibility analysis of convergence ...
Fabrication of new ion sensitive field effect transistors (ISFET) based on modification of junction-FET for analysis of hydronium, potassium and hydrazinium ions
A novel and ultra low cost ISFET electrode and measurement system was designed for ISFET application and detection of hydronium, hydrazinium and potassium ions. Also, a measuring setup containing appropriate circuits, a suitable analyzer (Advantech board), de noise reduction elements, a cooling system and a PC was used for controlling the ISFET electrode and various characteristic measurements. The t...
Simulation of MPI applications with time-independent traces
Analyzing and understanding the performance behavior of parallel applications on parallel computing platforms is a long-standing concern in the High Performance Computing community. When the targeted platforms are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In the context of applications implemented wi...
Self-optimization of MPI Applications Within an Autonomic Framework
An existing autonomic framework (MAWeS) can be used to provide run-time self-optimization for distributed applications. This paper introduces a new MAWeS Component that provides an interface for MPI applications. As case study, we will present the implementation of a dynamically-reconfigurable n-body solver, evaluating its obtained performance with and without the MAWeS framework under several ...
Journal
Journal title: Journal of Parallel and Distributed Computing
Year: 2022
ISSN: 1096-0848, 0743-7315
DOI: https://doi.org/10.1016/j.jpdc.2022.04.002